111 research outputs found

    Astro-WISE: Chaining to the Universe

    The recent explosion of recorded digital data and its processed derivatives threatens to overwhelm researchers when analysing their experimental data or when looking up data items in archives and file systems. While current hardware developments make it possible to acquire, process and store hundreds of terabytes of data at the cost of a modern sports car, the software systems to handle these data are lagging behind. This general problem is recognized and addressed by various scientific communities: for example, DATAGRID/EGEE federates compute and storage power over the high-energy physics community, while the astronomical community is building an Internet-geared Virtual Observatory, connecting archival data. These large projects either focus on a specific distribution aspect or aim to connect many sub-communities, and have a relatively long trajectory for setting standards and a common layer. Here, we report "first light" of a very different solution to the problem, initiated by a smaller astronomical IT community. It provides the abstract "scientific information layer" which integrates distributed scientific analysis with distributed processing and federated archiving and publishing. By designing new abstractions and mixing in old ones, a Science Information System with fully scalable cornerstones has been achieved, transforming data systems into knowledge systems. This breakthrough is facilitated by the full end-to-end linking of all dependent data items, which allows full backward chaining from the observer/researcher to the experiment. Key is the notion that information is intrinsic in nature and thus is the data acquired by a scientific experiment. The new abstraction is that software systems guide the user to that intrinsic information by forcing full backward and forward chaining in the data modelling.
    Comment: To be published in ADASS XVI ASP Conference Series, 2006, R. Shaw, F. Hill and D. Bell, ed
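    The backward chaining described in this abstract can be illustrated with a minimal sketch. It is not taken from Astro-WISE: the `DataItem` class, the `backward_chain` function and the item names are all hypothetical, and the sketch only assumes that every data product keeps references to the items it was derived from.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataItem:
    """Hypothetical data product that records the items it was derived from."""
    name: str
    parents: List["DataItem"] = field(default_factory=list)


def backward_chain(item: DataItem) -> List[str]:
    """Return the names of all ancestors of `item`, nearest first, without repeats."""
    seen: List[str] = []
    queue = list(item.parents)
    while queue:
        current = queue.pop(0)  # breadth-first: closest ancestors come first
        if current.name not in seen:
            seen.append(current.name)
            queue.extend(current.parents)
    return seen


# Illustrative chain: raw exposure and bias frame -> reduced image -> catalogue.
raw = DataItem("raw_exposure")
bias = DataItem("bias_frame")
reduced = DataItem("reduced_image", parents=[raw, bias])
catalog = DataItem("source_catalog", parents=[reduced])

lineage = backward_chain(catalog)
print(lineage)
```

    Starting from the final catalogue, the walk recovers every upstream item down to the raw exposure, which is the sense in which a result can be chained back to the experiment.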

    Merging Grid Technologies

    This paper reports the integration of the astronomical Grid solution realised in the Astro-WISE information system with the EGEE Grid, and the porting of Astro-WISE applications to EGEE. We review the architecture of the Astro-WISE Grid, define the problems of integrating the Grid infrastructures and present our solutions to them. We give examples of applications running on Astro-WISE and EGEE and review the future development of the merged system.

    Target and (Astro-)WISE technologies - Data federations and its applications

    After its first implementation in 2003, the Astro-WISE technology has been rolled out in several European countries and is used for the production of the KiDS survey data. In the multi-disciplinary Target initiative this technology, nicknamed WISE technology, has been further applied to a large number of projects. Here, we highlight the data handling of other astronomical applications, such as VLT-MUSE and LOFAR, together with some non-astronomical applications such as the medical projects Lifelines and GLIMPS, the MONK handwritten text recognition system, and business applications by, amongst others, Target Holding. We describe some of the most important lessons learned and describe the application of the data-centric WISE type of approach to the Science Ground Segment of the Euclid satellite.
    Comment: 9 pages, 5 figures, Proceedings IAU Symposium No 325 Astroinformatics 201

    The Euclid Archive Processing and Data Distribution Systems: A Distributed Infrastructure for Euclid and Associated Data

    The Euclid Archive System is an ambitious information system, which sits at the heart of the Euclid Science Ground Segment. It is a joint development between the Euclid Consortium and the ESAC Science Data Centre. It encompasses both Euclid data and the large volume of associated ground-based data (e.g. KiDS, DES and LSST). The Euclid Science Ground Segment consists of the Euclid Science Operations Centre and ten national Science Data Centres. The large data volumes demand that data transfer is minimized and that the processing is taken to the data. This is supported by the Euclid Archive Data Processing System and the Euclid Archive Distributed Data System. The Data Processing System consists of a central metadata repository, which contains the information necessary to process any data item and the full data lineage of any data product created. The Distributed Data System provides a cloud solution with a node at each of the national Science Data Centres, which controls data storage and transfer. It supports a large number of storage types, including POSIX, iRODS, gridftp and Xrootd. No limitations are placed on the storage implemented at an individual SDC. Furthermore, the user of the system needs no knowledge of where data is located. Jobs will be started at the most appropriate locations, or data transferred as necessary.

    The Role of the Euclid Archive System in the Processing of Euclid and External Data

    Euclid is an ESA M2 mission which will create a 15,000 square degree space-based survey. The Euclid Archive System (EAS) is a core element of the Science Ground Segment (SGS) of Euclid. The EAS follows a data-centric approach to data processing, whereby the Data Processing System (DPS) is responsible for the centralized metadata storage and the Distributed Storage System (DSS) supports the distributed storage of data files. The EAS-DPS implements the Euclid Common Data Model and, along with the EAS-DSS, provides numerous services for Euclid Consortium users and SGS subsystems. In addition, the EAS-DPS assists in the preparation of Euclid data releases, which are copied to the third EAS subsystem, the ESA-developed Science Archive System (SAS), where they become available to the wider astronomical community. The EAS-DPS implements the object-oriented Euclid Common Data Model using a relational DBMS for the storage. The EAS-DPS supports the tracing of the lineage of any data item in the system, and provides services for data quality assessment and data processing orchestration. The EAS-DSS is a distributed storage system which is based on a set of storage nodes located in each of the ten Science Data Centers of the Euclid SGS. The storage nodes support a wide range of solutions, from local disk using a Unix filesystem to iRODS nodes or Grid storage elements. In this paper the architectural design of EAS-DPS and EAS-DSS is reviewed, and the interaction between them and tests of the already implemented components are described.